[feat] Make hdfs function compatible with tensorflow-io and TensorFlow version earlier than 2.6 by using FileSystem in TensorFlow Env. Also support S3 and so on now. #281

MoFHeka · 2022-09-28T19:15:49Z

Description

[feat] Make hdfs function compatible with tensorflow-io and TensorFlow version earlier than 2.6 by using FileSystem in TensorFlow Env.

After TF 2.6, the TF file system implementations were migrated from the main TF repo to the TF IO repo. So when a TF add-on needs to use a file system, it is better to get the file system pointer from the TF Env for compatibility.
Other file systems, such as S3, are now supported as well.
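
To illustrate the idea of asking an environment object for a file system instead of linking against one specific implementation, here is a minimal self-contained sketch. It is not TensorFlow code: `DemoEnv`, `DemoFileSystem` and the three backends are hypothetical stand-ins, and the scheme parsing is an assumption made for the example. In TensorFlow itself the comparable lookup is `Env::GetFileSystemForFile`, which this sketch only imitates in shape.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a file system interface; not TensorFlow's class.
class DemoFileSystem {
 public:
  virtual ~DemoFileSystem() = default;
  virtual std::string Name() const = 0;
};

class DemoLocalFs : public DemoFileSystem {
 public:
  std::string Name() const override { return "local"; }
};

class DemoHdfs : public DemoFileSystem {
 public:
  std::string Name() const override { return "hdfs"; }
};

class DemoS3 : public DemoFileSystem {
 public:
  std::string Name() const override { return "s3"; }
};

// Hypothetical environment that owns one file system per URI scheme.
class DemoEnv {
 public:
  DemoEnv() {
    registry_[""] = std::make_unique<DemoLocalFs>();  // no scheme: local path
    registry_["hdfs"] = std::make_unique<DemoHdfs>();
    registry_["s3"] = std::make_unique<DemoS3>();
  }

  // Returns the file system registered for the scheme of `path`.
  // Paths without "://" are treated as local paths.
  DemoFileSystem* GetFileSystemForFile(const std::string& path) const {
    const std::size_t pos = path.find("://");
    const std::string scheme =
        (pos == std::string::npos) ? "" : path.substr(0, pos);
    const auto it = registry_.find(scheme);
    if (it == registry_.end()) {
      throw std::invalid_argument("no file system registered for scheme: " +
                                  scheme);
    }
    return it->second.get();
  }

 private:
  std::map<std::string, std::unique_ptr<DemoFileSystem>> registry_;
};
```

The caller only passes a path, so the same save/load code can serve HDFS, S3 or local files as long as a backend is registered for the scheme.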

Also add GPU/Redis backend implementation.

Also modify the GPU/CPU hash table code (rehash_if_needed/dump) to make it more decoupled.

An insert is typically performed after the rehash_if_needed call, but for unknown reasons with nvhashtable, the insert may have unpredictable consequences that result in NaN values in the training parameters. To avoid these errors, cudaDeviceSynchronize() is called before launching the insert kernels, for purposes such as cache synchronization.

Type of change

Checklist:

I've properly formatted my code according to the guidelines
- By running yapf
- By running clang-format
This PR addresses an already submitted issue for TensorFlow Recommenders-Addons
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works

How Has This Been Tested?

Compiled and ran the unit tests against different TF versions.

rhdong · 2022-10-26T02:43:06Z

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_gpu.h

  virtual void get(const K* d_keys, ValueType<V>* d_vals, bool* d_status,
                   size_t len, ValueType<V>* d_def_val, cudaStream_t stream,
                   bool is_full_size_default) const {}
  virtual size_t get_size(cudaStream_t stream) const { return 0; }
  virtual size_t get_capacity() const { return 0; }
-  virtual void remove(const K* d_keys, size_t len, cudaStream_t stream) {}
-  virtual void clear(cudaStream_t stream) {}
+  virtual void remove(const K* d_keys, size_t len, cudaStream_t stream) const {}


Why did you change this? remove and clear should not be const.
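
The reviewer's point can be sketched with a simplified interface. This is an illustration only, not the PR's class: `DemoTable`, `DemoSetTable` and `CountEntries` are hypothetical names, and the table stores plain `int` keys on the host rather than device pointers. Read-only queries are marked `const`, while operations that change the table's contents stay non-const, so that code holding a `const` reference cannot modify the table.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical, simplified table interface; not the PR's actual class.
class DemoTable {
 public:
  virtual ~DemoTable() = default;
  // Read-only queries: const is appropriate.
  virtual std::size_t get_size() const = 0;
  virtual bool contains(int key) const = 0;
  // Operations that change the contents: deliberately non-const.
  virtual void insert(int key) = 0;
  virtual void remove(int key) = 0;
  virtual void clear() = 0;
};

class DemoSetTable final : public DemoTable {
 public:
  std::size_t get_size() const override { return keys_.size(); }
  bool contains(int key) const override { return keys_.count(key) != 0; }
  void insert(int key) override { keys_.insert(key); }
  void remove(int key) override { keys_.erase(key); }
  void clear() override { keys_.clear(); }

 private:
  std::set<int> keys_;
};

// A caller that only needs to inspect the table takes a const reference.
// Calling table.remove(...) or table.clear() here would not compile.
inline std::size_t CountEntries(const DemoTable& table) {
  return table.get_size();
}
```

Note also that `override` makes the compiler reject a derived method whose const qualifier differs from the base declaration, which catches accidental signature drift between the base class and its implementations.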

rhdong

LGTM

…w version earlier than 2.6 by using FileSystem in TensorFlow Env. Change the HDFS file system support to cover all file systems; S3, local, and others are now supported. Also change the Redis backend functions to reduce their coupling with TF. Also modify the GPU/CPU hash table code (rehash_if_needed/dump/CPU insert) to make it more decoupled and stable.

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_gpu.h

but for unknown reasons with nvhashtable, the insert may have unpredictable consequences that result in NaN values in the training parameters. Avoid these errors by calling cudaDeviceSynchronize() before launching the insert kernels, for purposes such as cache synchronization.

rhdong

LGTM

luliyucoordinate

LGTM

rhdong · 2022-10-31T04:03:02Z

Excellent job! Thanks, @MoFHeka !

MoFHeka requested a review from rhdong as a code owner September 28, 2022 19:15

MoFHeka requested a review from luliyucoordinate September 28, 2022 19:16

MoFHeka force-pushed the master-dev branch from c146939 to 9bea2ec on September 28, 2022 19:39

rhdong reviewed Sep 29, 2022

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_cpu.h (outdated, resolved)

MoFHeka force-pushed the master-dev branch from 9bea2ec to ba85490 on September 30, 2022 08:29

rhdong force-pushed the master-dev branch from ba85490 to 1e38988 on October 3, 2022 01:53

luliyucoordinate previously approved these changes Oct 8, 2022

rhdong reviewed Oct 9, 2022

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/cuckoo_hashtable_op_gpu.cu.cc (outdated, resolved)

rhdong reviewed Oct 9, 2022

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/cuckoo_hashtable_op_gpu.cu.cc (outdated, resolved)

rhdong reviewed Oct 9, 2022

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_gpu.h (outdated, resolved)

rhdong reviewed Oct 9, 2022

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_gpu.h (outdated, resolved)

luliyucoordinate reviewed Oct 9, 2022

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_cpu.h (outdated, resolved)

MoFHeka dismissed luliyucoordinate’s stale review via 018f5da October 11, 2022 21:25

MoFHeka force-pushed the master-dev branch 8 times, most recently from f278091 to 0323ee3 on October 13, 2022 21:07

MoFHeka requested review from luliyucoordinate and rhdong October 14, 2022 03:21

rhdong reviewed Oct 14, 2022

.github/workflows/make_wheel_macOS_arm64.sh (resolved)

luliyucoordinate reviewed Oct 15, 2022

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_cpu.h (outdated, resolved)

MoFHeka force-pushed the master-dev branch 2 times, most recently from 9642b5f to a65c567 on October 15, 2022 05:02

MoFHeka requested review from rhdong and luliyucoordinate October 15, 2022 10:37

MoFHeka requested review from rhdong and luliyucoordinate October 20, 2022 14:19

MoFHeka force-pushed the master-dev branch 9 times, most recently from a5fd535 to 36540ce on October 26, 2022 00:16

rhdong reviewed Oct 26, 2022

rhdong previously approved these changes Oct 26, 2022

MoFHeka dismissed rhdong’s stale review via 3bb5dfe October 26, 2022 09:26

MoFHeka force-pushed the master-dev branch from 36540ce to 3bb5dfe on October 26, 2022 09:26

luliyucoordinate previously approved these changes Oct 26, 2022

MoFHeka dismissed luliyucoordinate’s stale review via 85f7ea8 October 27, 2022 08:55

MoFHeka force-pushed the master-dev branch 2 times, most recently from 85f7ea8 to 6a37b0c on October 28, 2022 03:53

rhdong force-pushed the master-dev branch from 6a37b0c to d8e65ea on October 28, 2022 04:04

MoFHeka force-pushed the master-dev branch from d8e65ea to 4d56f96 on October 28, 2022 04:15

MoFHeka force-pushed the master-dev branch from 266440d to 1640b21 on October 28, 2022 04:20

rhdong reviewed Oct 28, 2022

tensorflow_recommenders_addons/dynamic_embedding/core/kernels/lookup_impl/lookup_table_op_gpu.h (resolved)

MoFHeka force-pushed the master-dev branch from 1640b21 to 73df931 on October 28, 2022 08:15

rhdong approved these changes Oct 28, 2022

luliyucoordinate approved these changes Oct 29, 2022

rhdong merged commit f640888 into tensorflow:master Oct 31, 2022
